Extending the SOM Algorithm to Non-Euclidean Distances via the Kernel Trick
Abstract
The Self-Organizing Map (SOM) is a nonlinear projection technique that makes it possible to visualize the underlying structure of high-dimensional data. However, the original algorithm relies on Euclidean distances, which is often a serious drawback in real problems. In this paper, we present a new kernel version of the SOM algorithm that incorporates non-Euclidean dissimilarities while keeping the simplicity of the classical version. To achieve this goal, the data are nonlinearly transformed to a feature space by means of Mercer kernels, while the overall data structure is preserved. The new SOM algorithm has been applied to the challenging problem of word relation visualization. We report that, for certain classes of kernels, the kernel SOM improves on the maps generated by alternative methods.
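As a rough illustration of the idea of measuring distances in a kernel-induced feature space, the sketch below uses the standard identity ||phi(x) - phi(y)||^2 = k(x, x) - 2 k(x, y) + k(y, y) to find a best-matching unit. It is not the algorithm from the paper: the function names, the `gamma` value, and the simplification of keeping prototypes in input space are all assumptions made for this example.

```python
import math


def rbf_kernel(x, y, gamma=0.5):
    """Gaussian (RBF) kernel, a common Mercer kernel. gamma is an illustrative value."""
    sq = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq)


def linear_kernel(x, y):
    """Plain dot product; with this kernel the feature space is the input space."""
    return sum(a * b for a, b in zip(x, y))


def feature_space_sq_distance(x, y, kernel):
    """Squared distance between phi(x) and phi(y), computed from kernel values only."""
    return kernel(x, x) - 2.0 * kernel(x, y) + kernel(y, y)


def best_matching_unit(x, prototypes, kernel):
    """Index of the prototype closest to x in feature space.

    Assumption: prototypes are stored as input-space vectors, which is a
    simplification and not necessarily how the paper represents them.
    """
    dists = [feature_space_sq_distance(x, w, kernel) for w in prototypes]
    return min(range(len(prototypes)), key=dists.__getitem__)
```

With `linear_kernel` the distance reduces to the ordinary squared Euclidean distance, which is one way to check that the kernel form generalizes the classical SOM matching step.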
Similar resources
A Geometry Preserving Kernel over Riemannian Manifolds
The kernel trick and projection to tangent spaces are two choices for linearizing data points lying on Riemannian manifolds. These approaches provide the prerequisites for applying standard machine learning methods on Riemannian manifolds. Classical kernels implicitly project data to a high-dimensional feature space without considering the intrinsic geometry of the data points. ...
Integrating the improved CBP model with kernel SOM
In this paper, we first design a more generalized network model, Improved CBP, based on the same structure as Circular BackPropagation (CBP) proposed by Ridella et al. The novelty of ICBP lies in: 1) it substitutes the original extra added node with the isotropic quadratic form input in CBP with the one with an anisotropic quadratic form input; 2) particularly, the weights between the extra nod...
Topographic Mapping of Large Dissimilarity Data Sets
Topographic maps such as the self-organizing map (SOM) or neural gas (NG) constitute powerful data mining techniques that allow simultaneously clustering data and inferring their topological structure, such that additional features, for example, browsing, become available. Both methods have been introduced for vectorial data sets; they require a classical feature encoding of information. Often ...
Dimensionality Reduction via Euclidean Distance Embeddings
This report provides a mathematically thorough review and investigation of Metric Multidimensional scaling (MDS) through the analysis of Euclidean distances in input and output spaces. By combining a geometric approach with modern linear algebra and multivariate analysis, Metric MDS is viewed as a Euclidean distance embedding transformation that converts between coordinate and coordinate-free r...
The Kernel Trick for Distances
A method is described which, like the kernel trick in support vector machines (SVMs), lets us generalize distance-based algorithms to operate in feature spaces, usually nonlinearly related to the input space. This is done by identifying a class of kernels which can be represented as norm-based distances in Hilbert spaces. It turns out that common kernel algorithms, such as SVMs and kernel PCA, ...
Publication date: 2004